Replication Package for "Ethnic Media and the Mobilization of Identity"

Giacomo Lemoli (giacomo.lemoli@iast.fr)


SOFTWARE 
R version 4.1.0 (2021-05-18)

PACKAGES
Tidyverse:
  ggplot2 3.4.2
  dplyr 1.1.2
  tidyr 1.3.0
  readr 2.1.4
  purrr 1.0.1
  tibble 3.2.1
  stringr 1.5.0
  forcats 1.0.0

Other packages:
  stargazer 5.2.2
  interflex 1.2.6
  fixest 0.10.1
  ggpubr 0.6.0
  fwildclusterboot 0.3.6
  xtable 1.8.4
  tidylog 1.0.2
  haven 2.5.2
  labelled 2.8.0
  readxl 1.4.2
  viridis 0.6.3
  wordcloud2 0.2.1
  sf 1.0.13
  tmap 3.3.2
  patchwork 1.1.1
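
To compare a local installation with the versions listed above, a check along the following lines may help. It is a minimal sketch in base R: the `expected` vector reproduces only a few entries of the list (extend it as needed), and the helper `check_versions` is a hypothetical name, not part of the replication scripts.

```r
# Versions copied from the PACKAGES list above (a subset; extend as needed)
expected <- c(fixest = "0.10.1", interflex = "1.2.6",
              fwildclusterboot = "0.3.6", wordcloud2 = "0.2.1")

# check_versions() is a hypothetical helper, not part of the replication scripts.
# It returns one row per package: the expected version, the installed version
# (NA if the package is not installed), and whether the two are identical.
check_versions <- function(expected) {
  installed <- vapply(names(expected), function(pkg) {
    tryCatch(as.character(utils::packageVersion(pkg)),
             error = function(e) NA_character_)
  }, character(1))
  data.frame(
    package   = names(expected),
    expected  = unname(expected),
    installed = unname(installed),
    match     = !is.na(installed) & unname(installed) == unname(expected),
    stringsAsFactors = FALSE
  )
}

print(check_versions(expected))
```

A mismatch does not necessarily prevent replication, but the listed versions are the ones the results were produced with.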


DATA AVAILABILITY NOTES
Part of the analysis (Figures 7, H1, H2, H3, H4, H5, H6) relies on survey data from the Centro de Investigaciones Sociologicas (CIS).
According to the CIS user conditions, raw data cannot be shared with third parties without CIS approval.
Therefore, no data files are posted in the "Survey" sub-directory; only the scripts to clean and analyze the survey data are provided.
The raw data needed to replicate the analysis can be obtained for free from CIS by submitting a new request:

Visit the CIS data portal: https://www.cis.es/catalogo-estudios/resultados-definidos/buscador-estudios

In the internal search engine, under the field "Codigo estudio", enter the survey numeric IDs one by one:
- Survey # 2052 (1993) - Language use in bilingual communities
- Survey # 2296 (1998) - Language use in bilingual communities
- Survey # 2096 (1994) - Social and political situation in the Basque Country 
- Survey # 2282 (1998) - Social and political situation in the Basque Country 
- Survey # 2407 (2001) - Social and political situation in the Basque Country 
- Survey # 2593 (2005) - Social and political situation in the Basque Country

Click on the study and open the study page. In the right column, under "Documentos para descargar", click on "Fichero datos".

Fill in the request form and send it.

Survey 2052 was distributed in .sav format and was imported directly into R using the haven package.
The other files had no clear file extension. They were opened manually in SPSS and exported as .csv files, saved as "MD[svy number].csv".
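
Once the files have been obtained, a quick check such as the sketch below can confirm that they are in place. It assumes that the files sit in a "Data" folder inside the "Survey" sub-directory and that "MD[svy number].csv" expands to names such as MD2296.csv; both are readings of this README rather than confirmed details, and "prepare_survey.R" remains the authoritative import code. The helper `expected_csv` and the placeholder `sav_path` are hypothetical.

```r
# Surveys that were exported from SPSS to .csv (survey 2052 stays in .sav format)
csv_ids <- c(2296, 2096, 2282, 2407, 2593)

# Assumed naming: "MD" + survey number + ".csv" inside a "Data" folder
expected_csv <- function(ids, data_dir = "Data") {
  file.path(data_dir, paste0("MD", ids, ".csv"))
}

files <- expected_csv(csv_ids)
missing <- files[!file.exists(files)]
if (length(missing) > 0) {
  message("Files not found: ", paste(missing, collapse = ", "))
}

# Reading one file of each type; uncomment once the files are in place.
# sav_path is a placeholder, since the name of the 2052 file is not given here.
# svy_2052 <- haven::read_sav(sav_path)
# svy_2296 <- readr::read_csv(files[1])
```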


REPLICATION NOTES
Figure F4 (word cloud of topics mentioned in local news in 1978) was generated using the package wordcloud2.
This package appears to randomly perturb the position of words in the figure at every execution, even when a random seed is set.
Therefore the exact position of words in the Appendix figure will not be reproduced.
This should affect only the position of the words, and not their relative sizes, which is what matters substantively.
The replication script provides code to print the underlying distribution of topics, which does not suffer from this problem.
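
To illustrate why the relative sizes are reproducible even when the positions are not: word sizes derive from a frequency table, which is deterministic. The sketch below uses base R and placeholder labels (topic_a, topic_b, topic_c), not the actual coded topics; the code in the replication script should be used for the real distribution.

```r
# Placeholder labels standing in for coded topic mentions (not the real data)
mentions <- c("topic_a", "topic_b", "topic_a", "topic_c", "topic_a", "topic_b")

# Frequency table sorted from most to least mentioned
topic_freq <- sort(table(mentions), decreasing = TRUE)
print(topic_freq)

# Same information as a data frame with columns word and freq, the
# two-column layout that wordcloud2::wordcloud2() takes as input
topic_df <- data.frame(word = names(topic_freq),
                       freq = as.integer(topic_freq),
                       stringsAsFactors = FALSE)
```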



MATERIALS 
The directory is organized in three sub-directories, each of which is self-contained and reproduces a different part of the paper.
The sub-directories have the same structure. Before running each script, remember to define the relevant sub-directory as the working directory.

- "Municipal": folder for the analysis at municipal level
	- "Code": contains scripts to replicate the municipal-level results
			- "analysis_municipal_level.R": replication script
			- "functions.R": auxiliary user-written functions for the analysis 
	- "Data": contains input data
			- "data_final.RData": dataset with municipal-level variables
			- "Shapefiles": folder that contains the shapefiles for plotting the maps
	- "Output": folder where output is saved
			- "Tables": folder to save Tables 1, 2, 3, 4, 5, A1, C1, C2, C3, C4, C5, C6, C7, C8, C9, C10
			- "Figures": folder to save Figures 2, 3a, 3b, 4, 5, 6, B1, C1, C2, C3, D1, E1, G1

- "Programs": folder for the analysis of radio programs content
	- "Code": contains scripts to replicate the analysis of RPL programs
			- "analysis_programs.R": replication script 

	- "Data": contains spreadsheets created from manual coding of RPL transcripts
			- "Radio Popular de Loyola Febrero 1967.xlsx": Programs and songs in February 1967
			- "Radio Popular de Loyola Julio 1969.xlsx": Programs and songs in July 1969
			- "Radio Popular de Loyola Enero 1974.xlsx": Programs and songs in January 1974
			- "Radio Popular de Loyola Noviembre 1975.xlsx": Programs and songs in November 1975
			- "Radio Popular de Loyola Marzo 1978.xlsx": Programs, songs, and mentions of events/people in March 1978

	- "Output": folder where output is saved
			- "Figures": folder to save Figures 1, F1, F2, F3 

- "Survey": folder for the analysis of CIS survey data
	- "Code": contains scripts to replicate the results with survey data
			- "prepare_survey.R": script that cleans the raw CIS data files for analysis
			- "analysis_survey.R": replication script

	- ["Data"]: folder to be created, to save the raw data and intermediate data

	- "Output": folder where output is saved
			- "Figures": folder to save Figures 7, H1, H2, H3, H4, H5, H6
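
The working-directory step mentioned at the start of this section can be sketched as follows. The helper `run_in_subdir` is hypothetical, and calling the scripts through a path relative to the sub-directory (for example "Code/analysis_municipal_level.R") is an assumption inferred from the layout above, not something the scripts themselves confirm.

```r
# run_in_subdir() is a hypothetical helper: it makes `subdir` the working
# directory, sources the script, and restores the previous working directory.
run_in_subdir <- function(subdir, script) {
  old_wd <- setwd(subdir)
  on.exit(setwd(old_wd), add = TRUE)
  source(script)
  invisible(TRUE)
}

# Example calls, assuming the package was unpacked in the current directory:
# run_in_subdir("Municipal", file.path("Code", "analysis_municipal_level.R"))
# run_in_subdir("Programs",  file.path("Code", "analysis_programs.R"))
# run_in_subdir("Survey",    file.path("Code", "prepare_survey.R"))
# run_in_subdir("Survey",    file.path("Code", "analysis_survey.R"))
```

Restoring the previous working directory on exit keeps the three parts independent of one another, in line with each sub-directory being self-contained.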